
    Three myths about risk thresholds for prediction models

    Acknowledgments: This work was developed as part of the international STRengthening Analytical Thinking for Observational Studies (STRATOS) initiative. The objective of STRATOS is to provide accessible and accurate guidance in the design and analysis of observational studies (http://stratos-initiative.org/). Members of the STRATOS Topic Group ‘Evaluating diagnostic tests and prediction models’ are Gary Collins, Carl Moons, Ewout Steyerberg, Patrick Bossuyt, Petra Macaskill, David McLernon, Ben van Calster, and Andrew Vickers.
    Funding: The study is supported by the Research Foundation-Flanders (FWO) project G0B4716N and Internal Funds KU Leuven (project C24/15/037). Laure Wynants is a post-doctoral fellow of the Research Foundation-Flanders (FWO). The funding bodies had no role in the design of the study, the collection, analysis, or interpretation of data, or the writing of the manuscript.
    Contributions: LW and BVC conceived the original idea of the manuscript, to which ES, MVS, and DML then contributed. DT acquired the data. LW analyzed the data, interpreted the results, and wrote the first draft. All authors revised the work, approved the submitted version, and are accountable for the integrity and accuracy of the work.

    Covipendium : information available to support the development of medical countermeasures and interventions against COVID-19

    The living paper on the new coronavirus disease (COVID-19) provides a structured compilation of scientific data about the virus, the disease, and its control. Its objective is to help scientists identify the most relevant publications on COVID-19 in the mass of information that appears every day. It is also expected to foster a global understanding of disease control and to stimulate transdisciplinary initiatives.

    Predicting COVID-19 prognosis in the ICU remained challenging: external validation in a multinational regional cohort

    Objective: Many prediction models for Coronavirus Disease 2019 (COVID-19) have been developed. External validation is mandatory before implementation in the Intensive Care Unit (ICU). We selected and validated prognostic models in the Euregio Intensive Care COVID (EICC) cohort. Study design and setting: In this multinational cohort study, routine data from COVID-19 patients admitted to ICUs within the Euregio Meuse-Rhine were collected from March to August 2020. COVID-19 models were selected based on model type, predictors, outcomes, and reporting; general ICU scores were also assessed. Discrimination was assessed by the area under the receiver operating characteristic curve (AUC) and calibration by calibration-in-the-large and calibration plots. A random-effects meta-analysis was used to pool results. Results: In total, 551 patients were admitted. Mean age was 65.4±11.2 years, 29% were female, and ICU mortality was 36%. Nine out of 238 published models were externally validated. Pooled AUCs ranged from 0.53 to 0.70 and calibration-in-the-large from -9% to 6%. Calibration plots showed generally poor calibration, although the 4C Mortality Score and the SEIMC score were moderately calibrated. Conclusion: Of the nine prognostic models that were externally validated in the EICC cohort, only two showed reasonable discrimination and moderate calibration. For future pandemics, better models based on routine data are needed to support admission decision-making.
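    As a concrete illustration of the validation approach described above, the R sketch below computes the AUC and calibration-in-the-large per centre and pools the centre-specific AUCs with a random-effects meta-analysis. It is a minimal sketch, not the study's code: the data frame icu and its variables (died, p_hat, centre) are hypothetical stand-ins for the observed outcome, the model's predicted risk, and a centre identifier.

        # Minimal sketch of per-centre external validation and pooling (hypothetical data).
        library(pROC)     # AUC with confidence intervals
        library(metafor)  # random-effects meta-analysis

        validate_centre <- function(d) {
          roc_obj <- roc(d$died, d$p_hat, quiet = TRUE)
          auc_ci  <- ci.auc(roc_obj)            # lower bound, AUC, upper bound
          lp      <- qlogis(d$p_hat)            # linear predictor implied by the predicted risks
          # Calibration-in-the-large: intercept of a logistic model with the linear
          # predictor as offset (0 means predictions are correct on average).
          citl <- unname(coef(glm(died ~ 1 + offset(lp), family = binomial, data = d))[1])
          c(auc    = as.numeric(auc_ci[2]),
            auc_se = as.numeric(auc_ci[3] - auc_ci[1]) / (2 * qnorm(0.975)),
            citl   = citl)
        }

        res        <- do.call(rbind, lapply(split(icu, icu$centre), validate_centre))
        pooled_auc <- rma(yi = res[, "auc"], sei = res[, "auc_se"], method = "REML")
        summary(pooled_auc)

    Pooling the AUCs on the raw scale with their standard errors is one simple option; the published analysis may have pooled performance measures differently.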

    Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal

    Readers’ note: This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. This version is update 3 of the original article published on 7 April 2020 (BMJ 2020;369:m1328). Previous updates can be found as data supplements (https://www.bmj.com/content/369/bmj.m1328/related#datasupp). When citing this paper, please consider adding the update number and date of access for clarity.
    Funding: LW, BVC, LH, and MDV acknowledge specific funding for this work from Internal Funds KU Leuven, KOOR, and the COVID-19 Fund. LW is a postdoctoral fellow of Research Foundation-Flanders (FWO) and receives support from ZonMw (grant 10430012010001). BVC received support from FWO (grant G0B4716N) and Internal Funds KU Leuven (grant C24/15/037). TPAD acknowledges financial support from the Netherlands Organisation for Health Research and Development (grant 91617050). VMTdJ was supported by the European Union Horizon 2020 Research and Innovation Programme under ReCoDID grant agreement 825746. KGMM and JAAD acknowledge financial support from the Cochrane Collaboration (SMF 2018). KIES is funded by the National Institute for Health Research (NIHR) School for Primary Care Research. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care. GSC was supported by the NIHR Biomedical Research Centre, Oxford, and Cancer Research UK (programme grant C49297/A27294). JM was supported by Cancer Research UK (programme grant C49297/A27294). PD was supported by the NIHR Biomedical Research Centre, Oxford. MOH is supported by the National Heart, Lung, and Blood Institute of the United States National Institutes of Health (grant R00 HL141678). ICCvDH and BCTvB received funding from Euregio Meuse-Rhine (grant Covid Data Platform (coDaP) Interreg EMR187). The funders played no role in study design, data collection, data analysis, data interpretation, or reporting.

    Erratum to: Methods for evaluating medical tests and biomarkers

    [This corrects the article DOI: 10.1186/s41512-016-0001-y.]

    Evidence synthesis to inform model-based cost-effectiveness evaluations of diagnostic tests: a methodological systematic review of health technology assessments

    Background: Evaluations of diagnostic tests are challenging because of the indirect nature of their impact on patient outcomes. Model-based health economic evaluations of tests allow different types of evidence from various sources to be incorporated and enable cost-effectiveness estimates to be made beyond the duration of available study data. To parameterize a health economic model fully, all the ways a test impacts on patient health must be quantified, including but not limited to diagnostic test accuracy. Methods: We assessed all UK NIHR HTA reports published between May 2009 and July 2015. Reports were included if they evaluated a diagnostic test, included a model-based health economic evaluation, and included a systematic review and meta-analysis of test accuracy. From each eligible report we extracted information on the following topics: 1) what evidence aside from test accuracy was searched for and synthesised; 2) which methods were used to synthesise test accuracy evidence and how the results informed the economic model; 3) how, and whether, threshold effects were explored; 4) how the potential dependency between multiple tests in a pathway was accounted for; and 5) for evaluations of tests targeted at the primary care setting, how evidence from differing healthcare settings was incorporated. Results: The bivariate or hierarchical summary receiver operating characteristic (HSROC) model was implemented in 20 of the 22 reports that met all inclusion criteria. Test accuracy data for health economic modelling were obtained from meta-analyses completely in four reports, partially in fourteen reports, and not at all in four reports. Only two of the seven reports that used a quantitative test gave clear threshold recommendations. All 22 reports explored the effect of uncertainty in accuracy parameters, but most of those that used multiple tests did not allow for dependence between test results. Seven of the 22 tests were potentially suitable for primary care, but most of these reports found limited evidence on test accuracy in primary care settings. Conclusions: The uptake of appropriate meta-analysis methods for synthesising evidence on diagnostic test accuracy in UK NIHR HTAs has improved in recent years. Future research should focus on other evidence requirements for cost-effectiveness assessment, threshold effects for quantitative tests, and the impact of multiple diagnostic tests.
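    As a brief illustration of the bivariate meta-analysis of sensitivity and specificity mentioned in the results, one common way to fit it is as a generalized linear mixed model with exact binomial likelihoods (the Chu and Cole formulation); dedicated packages such as mada implement the closely related Reitsma model. The R sketch below is illustrative only and assumes a hypothetical data frame acc with one row per study and 2x2 counts TP, FN, FP, and TN.

        # Illustrative bivariate meta-analysis of test accuracy fitted as a GLMM
        # (hypothetical data frame 'acc' with columns study, TP, FN, FP, TN).
        library(lme4)

        long <- data.frame(
          study  = rep(acc$study, 2),
          sens   = rep(c(1, 0), each = nrow(acc)),   # 1 if the row refers to sensitivity
          spec   = rep(c(0, 1), each = nrow(acc)),   # 1 if the row refers to specificity
          events = c(acc$TP, acc$TN),                # true positives / true negatives
          n      = c(acc$TP + acc$FN, acc$TN + acc$FP)
        )

        fit <- glmer(cbind(events, n - events) ~ 0 + sens + spec + (0 + sens + spec | study),
                     family = binomial, data = long)

        plogis(fixef(fit))  # summary sensitivity and specificity on the probability scale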

    Clinical Risk Prediction Models based on Multicenter Data: Methods for Model Development and Validation

    Risk prediction models are developed to assist doctors in diagnosing patients, making decisions, counseling patients, or providing a prognosis. To enhance the generalizability of risk models, researchers increasingly collect patient data in different settings and join forces in multicenter collaborations. The resulting datasets are clustered: patients from one center may have more similarities than patients from different centers, for example due to regional population differences or local referral patterns. Consequently, the assumption of independent observations that underlies the statistical techniques most often used to analyze such data (e.g., logistic regression) does not hold. This is largely ignored in current clinical prediction research. Research that relies on faulty assumptions may yield misleading results and lead to suboptimal improvements in patient care. To address this issue, I investigated the consequences of ignoring the assumption of independence and studied alternative techniques that acknowledge clustering throughout the process of planning a study, building a model, and validating models in new data. I used mixed and random effects methods throughout the research, as they allow differences between centers to be modeled explicitly, and evaluated the proposed solutions with simulations and real clinical data. This dissertation covers sample size requirements, data collection and predictor selection, model fitting, and the validation of risk models in new data, focusing mainly on diagnostic models. The main case study is the development and validation of models for the pre-operative diagnosis of ovarian cancer, for which the multicenter dataset collected by the International Ovarian Tumor Analysis (IOTA) consortium is used. The results suggested that mixed effects logistic regression models offer center-specific predictions with better predictive performance in new patients than the predictions from standard logistic regression models. Although simulations showed that models were severely overfitted with only five events per variable, mixed effects models did not require more demanding sample size guidelines than standard logistic regression models. A case study on predictors of ovarian malignancy demonstrated that, in multicenter data, measurements may vary systematically from one center to another, indicating potential threats to generalizability. Such predictors can be detected using the residual intraclass correlation coefficient and may be excluded from risk models. In addition, a case study showed that, if statistical variable selection is used, mixed effects models are required in every step of the selection procedure to prevent incorrect inferences. Finally, case studies on risk models for ovarian cancer demonstrated that the predictive performance of risk models varied considerably between centers. This could be detected using meta-analytic models to analyze discrimination, calibration, and clinical utility. In conclusion, taking differences between centers into account during the planning of prediction research, the development of a model, and the validation of risk predictions in new patients offers insight into heterogeneity and yields better predictions in local settings. Many methodological challenges remain, among which are the inclusion of predictor-by-center interactions, the optimal application of mixed effects models in new centers, and the refinement of techniques to summarize clinical utility in multicenter data.
    Nonetheless, the findings in this dissertation imply that current clinical prediction research would benefit from adopting mixed and random effects techniques to fully employ the information that is available in multicenter data.
    Contents: 1 General introduction; 2 Development and validation of risk models; 3 Statistical methods for multicenter data; 4 Sample size for multicenter studies; 5 Predictor selection for multicenter studies; 6 Performance evaluation in multicenter studies; 7 Does ignoring clustering in multicenter data influence the predictive performance of risk models? A simulation study; 8 General discussion; Appendices (including R code for the simulation studies and a SAS macro to estimate the residual intraclass correlation).
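    To make the central techniques of the dissertation concrete, the R sketch below fits a random intercept logistic regression to clustered multicenter data with lme4 and computes the intraclass correlation on the latent logistic scale, one way to quantify between-center heterogeneity. It is a minimal sketch under assumed data: the data frame dat and its variables (malignant, age, ca125, center) are hypothetical and do not correspond to the IOTA dataset or to the dissertation's actual code.

        # Minimal sketch: random-intercept logistic regression for multicenter data
        # and the intraclass correlation on the latent scale (hypothetical data frame
        # 'dat' with outcome 'malignant' (0/1), predictors 'age' and 'ca125', and a
        # 'center' identifier).
        library(lme4)

        fit <- glmer(malignant ~ age + log(ca125) + (1 | center),
                     family = binomial, data = dat)

        tau2 <- as.data.frame(VarCorr(fit))$vcov[1]  # between-center intercept variance
        icc  <- tau2 / (tau2 + pi^2 / 3)             # ICC on the logistic latent scale

        # Center-specific predicted risks for centers represented in the data
        # (the estimated random intercepts are included in the predictions):
        p_center <- predict(fit, type = "response")

    For a patient in a new center without an estimated random intercept, predictions can be made with the random effect set to zero (re.form = NA in predict), which relates to one of the open issues the dissertation notes: how best to apply mixed effects models in new centers.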